Causal models for qualitative and mixed methods inference

Process tracing

Macartan Humphreys and Alan Jacobs

1 Process Tracing

The big picture: Intuition

1.1 Process tracing with a DAG

Simple insight: If you believe this model, then seeing \(M\) should tell you something about a query regarding the \(I\), \(D\) relationship in a case.

For instance, we might have the intuition: If there was no mobilization in a high-inequality case that democratized, then inequality didn’t cause the transition.

But how do we formalize this strategy?

1.2 Process tracing with a DAG

Key formal insight:

If you believe this model, then seeing \(M\) should tell you something about the \(\theta\)s, which are what define the effect of \(I\) on \(D\).

1.3 Process tracing with a Causal Model

  • Great as this is, the DAG by itself tells us nothing about the direction of effects
    • Inequality might boost or depress mobilization
    • Mobilization might encourage or delay democratization
  • For process-tracing, you need an informative causal model

1.4 Process tracing as new data for finer inferences

  • We are focusing on a case

  • We start with two things:

    • A background theory about how things work in cases like this
    • Background knowledge about the case: perhaps the outcome; perhaps the value of a causal condition or two
  • What we don’t know is whether any of those conditions caused the outcome

  • So we collect more information “from other parts of the DAG” to help figure that out

2 Example: Democratization in Malawi (1994)

Let’s walk through the intuition

2.1 A case-level question: Malawi (1994)

  • Suppose we observe democratization as the outcome in Malawi: \(D=1\)

  • We also observe high inequality in Malawi: \(I=1\)

  • We want to know: did \(I=1\) cause \(D=1\) in Malawi?

2.2 Prior data on Malawi

  • Let’s process-trace: observe \(M\) in Malawi
    • Was there mass mobilization in Malawi?

2.3 New within-case data on Malawi

  • Suppose we go to the field and we learn that mass mobilization DID occur in Malawi

    • So \(M=1\)
  • What can we conclude?

  • NOTHING YET!

2.4 Inference

  • We knew \(I=1\), \(D=1\)

  • We then saw \(M=1\)

  • But which is more consistent with \(I=1\) causing \(D=1\)?

    • Mobilization occurring? or
    • Mobilization not occurring?

2.5 Why prior theory matters for process tracing

  • Suppose I believe that high inequality usually prevents mass mobilization
    • Poverty makes it harder to mobilize
  • Suppose I also believe that mobilization usually prevents democratization
    • Instills fear in autocrats and provokes repression
  • Which value of \(M\) would be most consistent with \(I=1\) causing \(D=1\)?
    • \(M=0\)

2.6 Why prior theory matters for process tracing

  • Suppose I believe that high inequality usually causes mass mobilization
    • Inequality creates grievances for protest
  • Suppose I also believe that mobilization usually causes democratization
    • Instills fear in autocrats and provokes concessions
  • Which value of \(M\) would be most consistent with \(I=1\) causing \(D=1\)?
    • \(M=1\)

2.7 Why prior theory matters for process tracing

  • If I believe inequality usually causes democratization by preventing mobilization, which itself hinders democratization…
    • then observing \(M=0\) constitutes evidence that \(I=1\) caused \(D=1\)
  • If I believe inequality usually causes democratization by causing mobilization, which itself causes democratization…
    • then observing \(M=1\) constitutes evidence that \(I=1\) caused \(D=1\)

2.8 Two theories of how \(I=1\) could cause \(D=1\)

  • Chain of positive effects
    • Inequality boosts mobilization
    • Mobilization causes democratization
  • Chain of negative effects
    • Inequality depresses mobilization
    • Mobilization prevents democratization

2.9 Interpretation of PT evidence depends on theory

In particular: beliefs about which mechanism is most likely operating

If we have already seen \(I=1, D=1\)

  • Linked positive effects require us to then see: \(M=1\)

  • Linked negative effects require us to then see: \(M=0\)

  • So which \(M\) value is more consistent with a positive \(I \rightarrow D\) effect depends on which of these is more common in the world:

    • Linked positive effects?
    • Linked negative effects?
  • We need to draw on our theoretical beliefs

    • about the distribution of causal effects in the relevant population

2.10 Combining beliefs across the DAG

  • Suppose we believe that:

    • \(I\) has a positive effect on \(M\) 30% of the time (= in 30% of cases)
    • \(I\) has a negative effect on \(M\) 10% of the time
  • Suppose we believe that:

    • \(M\) has a positive effect on \(D\) 60% of the time
    • \(M\) has a negative effect on \(D\) 10% of the time
  • Then the probability of linked positive effects is simply: \(0.3 \times 0.6 = 0.18\)

  • Probability of linked negative effects is simply: \(0.1 \times 0.1 = 0.01\)

  • So we believe linked positive effects are a MUCH more common way of generating a positive \(I \rightarrow D\) effect than linked negative effects
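The arithmetic above can be checked in a few lines of base R. This is a minimal sketch rather than anything from CausalQueries: the variable names are illustrative, and it assumes, as the multiplication on the slide does, that effects at the two steps are independent of each other.

```r
# Beliefs from the slide (variable names are illustrative)
p_IM <- c(positive = 0.3, negative = 0.1)  # effect of I on M
p_MD <- c(positive = 0.6, negative = 0.1)  # effect of M on D

# ASSUMPTION: effects at the two steps are independent, so we can multiply
linked_positive <- unname(p_IM["positive"] * p_MD["positive"])
linked_negative <- unname(p_IM["negative"] * p_MD["negative"])

# The two routes to a positive I -> D effect
p_positive_effect <- linked_positive + linked_negative

# Share of positive I -> D effects that run through linked positive effects
share_via_positive <- linked_positive / p_positive_effect

round(c(linked_positive, linked_negative, share_via_positive), 3)
```

Under these beliefs, roughly 95% of positive \(I \rightarrow D\) effects run through linked positive effects.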

2.11 From beliefs to the meaning of the evidence

  • We believe linked positive effects are a MUCH more common way of generating a positive \(I \rightarrow D\) effect than linked negative effects

  • Given \(I=1\) and \(D=1\), there can only be linked positive effects IF mobilization occurs (\(M=1\))

  • So if we observe \(M=1\), we will think it’s more likely that high inequality DID cause democratization

    • Because \(M=1\) is consistent with the MORE likely way this effect could have happened
  • If we observe \(M=0\), we think it’s less likely that high inequality caused democratization

    • Because \(M=0\) is consistent only with the LESS likely way this effect could have happened

2.12 How the meaning of the evidence hinges on beliefs

Alternatively…

  • Suppose we believe that:

    • \(I\) has a positive effect on \(M\) 20% of the time (= in 20% of cases)
    • \(I\) has a negative effect on \(M\) 20% of the time
  • Suppose we believe that:

    • \(M\) has a positive effect on \(D\) 10% of the time
    • \(M\) has a negative effect on \(D\) 60% of the time
  • Probability of linked positive effects is: \(0.2 \times 0.1 = 0.02\)

  • Probability of linked negative effects is: \(0.2 \times 0.6 = 0.12\)

  • Now linked negative effects are a MUCH more common way of generating a positive \(I \rightarrow D\) effect than linked positive effects

  • Under these theoretical beliefs, \(M=0\) would be more consistent than \(M=1\) with a positive \(I \rightarrow D\) effect
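Written out, and relying on the same independence assumption as the multiplication above, the share of positive \(I \rightarrow D\) effects that runs through each mechanism under these alternative beliefs is roughly:

\[
\Pr(\text{linked positive} \mid \text{positive } I \rightarrow D \text{ effect}) = \frac{0.02}{0.02 + 0.12} \approx 0.14,
\qquad
\Pr(\text{linked negative} \mid \text{positive } I \rightarrow D \text{ effect}) = \frac{0.12}{0.02 + 0.12} \approx 0.86
\]

Since linked negative effects require \(M=0\) when \(I=1\) and \(D=1\), most of the ways in which inequality could have caused democratization now pass through the absence of mobilization.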

2.13 Process tracing in Malawi

To recap:

  • We want to know if \(I=1\) caused \(D=1\) in Malawi

  • Given our DAG, there are two ways to generate a positive effect of \(I\) on \(D\)

    • Two ways that high inequality could have caused democratization in Malawi
    • Linked positive effects or linked negative effects
  • When we see \(M=1\) in Malawi, what we should conclude about \(I\)’s effect on \(D\) depends on how we think the world works

    • What kinds of effects do we think are most common at each step
  • This defines what ways of generating a positive \(I \rightarrow D\) are most/least common

  • Which tells us which value of \(M\) is most consistent with such an effect

2.14 So: Process tracing is always theory dependent

  • The evidence in process tracing never speaks for itself

  • Our inferences from PT evidence always depend on theory

    • On our beliefs about how the world works
  • Most process tracing is either silent about these beliefs or expresses them informally

  • We can formalize these beliefs

    • Maximize explicitness and analytic transparency

2.15 In practice: Process tracing in Malawi

  • Prior beliefs we use in the book (roughly):

    • \(I\) has positive effect on \(M\) 50% of the time (given other data)
    • \(I\) never has a negative effect on \(M\)
    • \(M\) has positive effect on \(D\) 50% of the time
    • \(M\) never has a negative effect on \(D\)
  • We process trace and observe that mobilization occurred: \(M=1\)

  • Given our priors, this observation increases our confidence that \(I=1\) caused \(D=1\)
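A rough sketch of how this update might be computed on the simple chain \(I \rightarrow M \rightarrow D\) follows. It is not the model used in the book: it assumes (our assumption, not stated on the slide) that the cases without a positive effect are split evenly between "always 0" and "always 1" types, and that types at the two nodes are independent. All names in the code are illustrative.

```r
# Nodal types for M (response to I) and D (response to M):
# "pos" = positive effect; "always0" / "always1" = no effect.
# From the slide: positive effects 50% of the time, never negative.
# ASSUMPTION (ours): the remaining 50% is split evenly between always0 and always1.
p_M <- c(pos = 0.5, always0 = 0.25, always1 = 0.25)
p_D <- c(pos = 0.5, always0 = 0.25, always1 = 0.25)

# All combinations of nodal types, assuming independence across nodes
types <- expand.grid(tM = names(p_M), tD = names(p_D), stringsAsFactors = FALSE)
types$prior <- p_M[types$tM] * p_D[types$tD]

# Values of M and D implied by each type in a case with I = 1
types$M <- as.integer(types$tM %in% c("pos", "always1"))
types$D <- as.integer(types$tD == "always1" | (types$tD == "pos" & types$M == 1))

# Query: I = 1 caused D = 1; with no negative effects this needs "pos" at both steps
types$query <- types$tM == "pos" & types$tD == "pos"

# Probability of the query among the types consistent with what we have observed
posterior <- function(consistent) {
  sum(types$prior[types$query & consistent]) / sum(types$prior[consistent])
}

before_M <- posterior(types$D == 1)                 # knowing only I = 1, D = 1
after_M1 <- posterior(types$D == 1 & types$M == 1)  # after also seeing M = 1
after_M0 <- posterior(types$D == 1 & types$M == 0)  # had we seen M = 0 instead
```

Under these illustrative assumptions, seeing \(M=1\) moves the probability from 0.40 to about 0.44, while seeing \(M=0\) would have ruled the effect out. The size of the shift depends on the assumed split of the no-effect types.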

2.16 Where do these prior beliefs come from?

  • Literature
    • We draw on leading works on inequality and democratization (Boix, Acemoglu and Robinson, Ansell and Samuels)
  • Deductive theorization
    • E.g., a formal model (see Chapter 6.3)
  • Crowd-sourcing
    • Survey experts
  • Cannot read directly off population-level descriptives
    • E.g., the number of \(I=1, M=1\) cases tells us nothing about how often inequality causes mobilization

3 Democratization example

Process tracing for several cases

3.1 Process tracing for several cases

  • With different possible clues we could observe

3.2 Process tracing to answer a pathway question

  • Similar idea for answering a pathway question
  • Observing \(M\) can help us figure out how \(I\) caused \(D\)
  • Once we provide population-level beliefs about causal effects

3.3 Other pieces of evidence

  • \(P\) can also be informative about \(I\)’s effect on \(D\)
  • Will depend on our population-level beliefs about \(P\)’s effects on \(D\) and interactions with \(M\)

4 General strategy for process tracing with causal models

  • We can figure out from the causal model which causal types are consistent with our query
  • We can figure out from the causal model which causal types are consistent with the data we observe
  • If we have probabilities for each of these we can figure out the probability of the query given the data
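In symbols, the three steps amount to the following. The notation is ours: \(\theta\) ranges over causal types, \(Q\) is the query and \(d\) the observed data.

\[
\Pr(Q \mid d) = \frac{\sum_{\theta} \Pr(\theta)\, \mathbb{1}[\theta \text{ satisfies } Q]\, \mathbb{1}[\theta \text{ is consistent with } d]}{\sum_{\theta} \Pr(\theta)\, \mathbb{1}[\theta \text{ is consistent with } d]}
\]

The numerator is the prior probability of the types that both satisfy the query and could have produced the data; the denominator is the prior probability of all types that could have produced the data.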

4.1 Key Bayesian insight

Figure 1: Logic of simple updating on arbitrary queries.

4.2 From the DAG alone…

We can tell when some evidence might potentially matter

4.3 From a full model…

We can say much more… actually making (model-dependent) inferences

4.4 Process tracing with causal models

Key advantages of using a causal model:

  • Makes clear how process tracing is always theory-dependent
  • Provides a way of systematically integrating background knowledge (theory) into inference
  • Makes highly explicit/transparent how we get from evidence to conclusions
  • Enhances evaluability of findings
    • If I have different background beliefs from you, I can work out how my conclusions from the same evidence would be different

5 Procedure for process tracing

5.1 By hand: Procedure

  1. Make a table with a row for each causal type (there may be many if you are doing this by hand!)
  2. Add a column to indicate your priors over these causal types
  3. Add a column to say whether the query is satisfied by each causal type
  4. Calculate the probability of the query conditional on the data, using only the causal types that are consistent with the data

5.2 By hand: Example

Our DAG is:

\(X \rightarrow M \rightarrow Y\)

And we believe:

  • \(X=1\) for half the cases, randomly
  • \(X\) has a positive effect on \(M\) for half the cases (“causes”), in the other half \(M=0\) regardless of \(X\)
  • \(M\) has a positive effect on \(Y\) for half the cases, in the other half \(Y=0\) regardless of \(M\)

5.3 By hand: Example

What are the types? How likely is each one? How likely is each given the data?

  • Which ones satisfy the following query: \(Y = 0\) because \(X=0\)?
  • Which ones are consistent with the data: \(X = 0, Y = 0\)?

| Type | X | M | Y | prob | Query? | Data? |
|:-----|--:|--:|--:|-----:|:-------|:------|
| X = 0, X causes M, M causes Y | 0 | 0 | 0 | 1/8 | | |
| X = 0, X causes M, M does not cause Y | 0 | 0 | 0 | 1/8 | | |
| X = 0, X does not cause M, M causes Y | 0 | 0 | 0 | 1/8 | | |
| X = 0, X does not cause M, M does not cause Y | 0 | 0 | 0 | 1/8 | | |
| X = 1, X causes M, M causes Y | 1 | 1 | 1 | 1/8 | | |
| X = 1, X causes M, M does not cause Y | 1 | 1 | 0 | 1/8 | | |
| X = 1, X does not cause M, M causes Y | 1 | 0 | 0 | 1/8 | | |
| X = 1, X does not cause M, M does not cause Y | 1 | 0 | 0 | 1/8 | | |
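The by-hand procedure can also be sketched in base R. This is an illustration only, with column names of our choosing; it builds the eight types from the stated beliefs, treats the three 50/50 features as independent (which is what the 1/8 in the table implies), and then fills in the query and data columns.

```r
# Eight types: value of X, whether X causes M, whether M causes Y
# (expand.grid varies the first argument fastest, matching the table's row order)
types <- expand.grid(
  M_causes_Y = c(TRUE, FALSE),
  X_causes_M = c(TRUE, FALSE),
  X = c(0, 1)
)
types$prob <- 1 / 8  # three independent 50/50 features

# Realized values: absent a positive effect, M (or Y) is 0 regardless
types$M <- as.integer(types$X_causes_M & types$X == 1)
types$Y <- as.integer(types$M_causes_Y & types$M == 1)

# Query: Y = 0 because X = 0, i.e. Y would have been 1 had X been 1
types$query <- types$X == 0 & types$X_causes_M & types$M_causes_Y

# Data: X = 0, Y = 0
types$data <- types$X == 0 & types$Y == 0

# Probability of the query given the data
p_query_given_data <-
  sum(types$prob[types$query & types$data]) / sum(types$prob[types$data])
p_query_given_data
```

One type satisfies the query and four are consistent with the data, so the probability is (1/8) / (4/8) = 0.25.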

5.4 With CausalQueries: Step 1

Define a model

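library(CausalQueries)

# Chain model; the restrictions rule out negative effects and "always 1" types
model <- 
  make_model("X -> M -> Y") |>
  set_restrictions("M[X = 0] == 1") |>  # drop types with M = 1 when X = 0
  set_restrictions("Y[M = 0] == 1")     # drop types with Y = 1 when M = 0

# Query: X has a positive effect on Y
query <- "Y[X=1] > Y[X=0]"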

5.5 With CausalQueries: Step 2

Get types consistent with query

get_query_types(model, query)

Causal types satisfying query's condition(s)  

 query =  Y[X=1]>Y[X=0] 

X0.M01.Y01  X1.M01.Y01


 Number of causal types that meet condition(s) =  2
 Total number of causal types in model =  8

5.6 With CausalQueries: Step 3

Get mapping from causal types to consistent data types

inspect(model, what = "ambiguities_matrix") 

ambiguities_matrix (Ambiguities matrix)
Mapping from causal types into data types:
         X0M0Y0 X1M0Y0 X1M1Y0 X1M1Y1
X0M00Y00      1      0      0      0
X1M00Y00      0      1      0      0
X0M01Y00      1      0      0      0
X1M01Y00      0      0      1      0
X0M00Y01      1      0      0      0
X1M00Y01      0      1      0      0
X0M01Y01      1      0      0      0
X1M01Y01      0      0      0      1

5.7 With CausalQueries: Step 4

Get prior probabilities of each causal type

CausalQueries:::get_type_prob(model)
[1] 0.125 0.125 0.125 0.125 0.125 0.125 0.125 0.125

5.8 Put it all together

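library(dplyr)  # for mutate()
library(knitr)  # for kable()

# One row per causal type: consistent data types, query membership, prior
model  |>
  grab(what = "ambiguities_matrix") |>
  data.frame() |>
  mutate(
    in_query = get_query_types(model, query)$types,
    priors   = CausalQueries:::get_type_prob(model)) |>
  kable()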
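|         | X0M0Y0| X1M0Y0| X1M1Y0| X1M1Y1|in_query | priors|
|:--------|------:|------:|------:|------:|:--------|------:|
|X0M00Y00 |      1|      0|      0|      0|FALSE    |  0.125|
|X1M00Y00 |      0|      1|      0|      0|FALSE    |  0.125|
|X0M01Y00 |      1|      0|      0|      0|FALSE    |  0.125|
|X1M01Y00 |      0|      0|      1|      0|FALSE    |  0.125|
|X0M00Y01 |      1|      0|      0|      0|FALSE    |  0.125|
|X1M00Y01 |      0|      1|      0|      0|FALSE    |  0.125|
|X0M01Y01 |      1|      0|      0|      0|TRUE     |  0.125|
|X1M01Y01 |      0|      0|      0|      1|TRUE     |  0.125|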
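With these three ingredients in one table, the remaining step is a weighted sum. The sketch below hard-codes the values shown in the table so that it runs without CausalQueries. The object names are ours, and this is one plausible way to do the calculation, not necessarily how the package does it internally.

```r
# Ambiguities matrix, query membership and priors copied from the table
type_names <- c("X0M00Y00", "X1M00Y00", "X0M01Y00", "X1M01Y00",
                "X0M00Y01", "X1M00Y01", "X0M01Y01", "X1M01Y01")
A <- matrix(
  c(1, 0, 0, 0,
    0, 1, 0, 0,
    1, 0, 0, 0,
    0, 0, 1, 0,
    1, 0, 0, 0,
    0, 1, 0, 0,
    1, 0, 0, 0,
    0, 0, 0, 1),
  nrow = 8, byrow = TRUE,
  dimnames = list(type_names, c("X0M0Y0", "X1M0Y0", "X1M1Y0", "X1M1Y1"))
)
in_query <- c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE)
priors   <- rep(0.125, 8)

# P(query | data) for every data type at once:
# prior mass on types that produce the data AND satisfy the query,
# divided by prior mass on all types that produce the data.
# (A * priors multiplies each row of A by that type's prior.)
posterior <- colSums(A * priors * in_query) / colSums(A * priors)
posterior
```

For the data type `X0M0Y0` this gives 0.25: of the four equally likely causal types that produce that data, one satisfies the query.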

5.9 In one go with CausalQueries

query_model(model, query, given = c("X==0 & M ==0 & Y == 0"))

Causal queries generated by query_model (all at population level)

|label                                       |using      | mean|
|:-------------------------------------------|:----------|----:|
|Y[X=1] > Y[X=0] given X==0 & M ==0 & Y == 0 |parameters | 0.25|

Also try our shiny app

6 References